Abstract

We present an approach for dynamic information flow control
across the application and database. Our approach reduces
the amount of policy code required, yields formal guarantees
across the application and database, works with existing
relational database implementations, and scales for realistic
applications. In this paper, we present a programming model that
factors out information flow policies from application code
and database queries, a dynamic semantics for the underlying
$\lambda^{JDB}$ language, and proofs of termination-insensitive
non-interference and policy compliance for the semantics.
We implement these ideas in Jacqueline, a Python web framework,
and demonstrate feasibility through three application
case studies: a course manager, a health record system, and
a conference management system used to run an academic
workshop. We show that in comparison to traditional applications
with hand-coded policy checks, Jacqueline applications
have 1) a smaller trusted computing base, 2) fewer lines of
policy code, and 3) reasonable, often negligible, overheads.

Categories and Subject Descriptors D.3.3 [Programming
Languages]: Language Constructs and Features

General Terms Frameworks, Security 

Keywords Web frameworks, information flow 


1. Introduction 

From social networks to electronic health record systems, 
programs increasingly process sensitive data. As information 
leaks often arise from programmer error, a promising way to 
reduce leaks is to reduce opportunities for programmer error. 

A major challenge in securing web applications involves
reasoning about the flow of sensitive data across the application
and database. According to the OWASP report [42],
errors frequently occur at component boundaries. Indeed,
the difficulty of reasoning about how sensitive data flows
through both application code and database queries has led
to leaks in systems from the HotCRP conference management
system [3] to the social networking site Facebook [47].
The patch for the recent HotCRP bug involves policy checks
across application code and database queries.

Information flow control is important to securing the
application-database boundary [15, 18, 29, 42]. This is because
leaks often involve the results of computations on
sensitive values, rather than sensitive values themselves. To
reduce the opportunity for inadvertent leaks, we present a
policy-agnostic approach [7, 48]. Using this approach, the
programmer factors out the implementation of information
flow policies from application code and database queries. The
system manages the policies, removing the need to trust the
remaining code. The program thus specifies each policy once,
rather than as repeated intertwined checks across the program.
Because of this, policy-agnostic programs require less policy
code. We illustrate these differences in Figure 1.

Supporting policy-agnostic programming for web applications
requires the framework to enforce information flow
policies across the application and database. As we also show
in Figure 1, a standard web program runs using an application
runtime and a database, with an object-relational mapping (ORM)
mediating interactions between the two. Our web framework
uses a policy-agnostic application runtime and a specialized
ORM that mediates interactions between policy-agnostic
application code and policy-agnostic database queries.

Figure 1. Application architecture in a standard web server
compared to a policy-agnostic web server.

There are three main parts to our solution: 1) supporting
policy-agnostic database queries, 2) providing formal guarantees
across the application and database, and 3) addressing
issues of practical feasibility. We extend prior work on the
Jeeves programming language [7, 48] that defines a
policy-agnostic semantics for a simple imperative language. As is
common with language-based approaches, Jeeves’s guarantees
extend only within the Jeeves runtime. Interoperation
with external databases is important as web applications rely
on commodity databases for performance reasons. The challenge
is, then, to support policy-agnostic programming for
database queries in a way that leverages existing database
implementations while providing strong guarantees.

We present faceted databases for supporting policy-agnostic
database queries. The Jeeves runtime performs
different computations based on the permissions of the user
viewing the output. Because the viewer may not be known
in advance, the runtime uses faceted execution to simulate
simultaneous executions. A faceted value is the runtime
representation of a value that may differ across executions.
Semantically, a faceted database stores faceted values and
performs faceted query execution. We show how to use a
faceted object-relational mapping (FORM) to embed faceted
values using relational databases and, surprisingly, to support
faceted query execution simply by manipulating meta-data.
The FORM manages complex dependencies, allowing a policy
to query the data it protects.

Next we show that interoperation with faceted databases
yields strong guarantees. We extend Jeeves’s core language
with relational operators to create the $\lambda^{JDB}$ core language.
We present a dynamic faceted execution semantics for
$\lambda^{JDB}$ and prove termination-insensitive non-interference
and policy compliance. The formalization corresponds closely to an
implementation strategy using existing database implementations
while yielding concise proofs.

Towards supporting realistic applications, we formulate 
an “Early Pruning” optimization. While simulating multiple 
executions is desirable for reasoning, exploring multiple 
executions can be expensive in practice. The Early Pruning 
optimization allows the program to use program assumptions 
to safely explore fewer executions. This optimization is 
particularly useful for web applications, where it is often 
possible to use the session user to predict the viewer. With 
Early Pruning, performance may even be better than with 
hand-coded checks, as the runtime may now check policies 
once rather than repeatedly throughout execution. 

Finally, we demonstrate practical feasibility. We present
Jacqueline, a web framework based on Python’s Django [1]
framework. We use Jacqueline to build several application
case studies, including a conference management system that
we have deployed to run an academic workshop. The case
studies show that using Jacqueline, policies are localized and
the size of the policy code is smaller. Consequently, security
audits can focus on the localized policy specifications rather
than having to review the entire code base. We also demonstrate
that Jacqueline has reasonable, often negligible, overheads.
For one case, the Jacqueline implementation performs
better than an implementation with hand-coded policies.

In summary, we make the following contributions: 


• Policy-agnostic web programming. We present an approach
that allows programmers to factor out information
flow policies from the rest of web programs and rely on a
web framework to dynamically enforce the policies.

• Faceted databases. We present faceted databases to
support policy-agnostic relational database queries. We
present a faceted object-relational mapping (FORM) strategy
for implementing faceted databases using existing
relational database implementations.

• Faceted execution for database-backed applications.
We show interoperation of faceted databases with faceted
application runtimes by presenting a dynamic semantics
for the $\lambda^{JDB}$ core language and proving
termination-insensitive non-interference and policy compliance.

• Early Pruning optimization. We address performance
issues by formalizing an optimization, proving that it
preserves policy compliance, and demonstrating that it
significantly decreases overheads.

• Demonstration of practical feasibility. We present the
Jacqueline web framework and demonstrate expressiveness
and performance through several application case
studies. We compare against hand-implemented policies,
showing that not only does Jacqueline reduce lines of policy
code, but also that policy enforcement has reasonable,
often negligible, overheads.






Our approach decreases the opportunity for programmer error, 
provides strong formal guarantees, and is practically feasible. 

2. Introductory Example 

Using our policy-agnostic web framework, the programmer
implements each information flow policy once, associated
with the data schemas, as opposed to repeatedly across the
code base. We designed Jacqueline so that programming with
it is as similar as possible to programming with Django. In
Jacqueline, the application runtime and object-relational mapping
dynamically manipulate sensitive values and policies so
the programmer may omit repeated checks.

Consider a social calendar application. Suppose Alice and 
Bob want to plan a surprise party for Carol, 7pm next Tuesday 
at Schloss Dagstuhl. They should be able to create an event 
such that information is visible only to guests. Carol should 
see that she has an event 7pm next Tuesday, but not that it is 
a party. Everyone else may see that there is a private event at 
Schloss Dagstuhl, but not event details. 

We demonstrate how to implement this example using
Jacqueline, our new web framework based on Django [1], a
model-view-controller framework. In a standard MVC framework,
the model describes the data, the view describes frontend
page rendering, and the controller implements other functionality.
An object-relational mapping (ORM) supports a uniform
object representation. In Jacqueline, the model additionally
specifies information flow policies. The faceted
object-relational mapping (FORM) additionally supports a uniform
representation of sensitive values and policies. Jacqueline is
policy-agnostic: other than the policies, a Jacqueline program
looks like a policy-free Django program.

The division of labor between the programmer and the
framework is as follows. The programmer associates information
flow policies with fields in the data schema, codes
within the subset of Python supported by our Jeeves library,
and accesses the database only through the Jacqueline API.
The framework tracks sensitive values and policies between
the application and database to produce outputs that adhere
to the policies. In our attack model, the user is untrusted and
we assume the programmer is not malicious.

We intend for this example to explain the semantics of
policy-agnostic web programming. We discuss implementation
and optimization issues in later sections.

2.1 Schemas and Policies in Jacqueline 

In Jacqueline’s policy-agnostic programming model, programmers
are responsible for specifying information flow
policies and the application runtime and object-relational
mapping are responsible for tracking the flow of sensitive
values to produce outputs adhering to those policies. Programmers
specify each information flow policy once, associated
with the data schema in the model. We show a sample
schema for the Event and EventGuest data objects in Figure 2.
A Jacqueline schema defines field names, field types,
and optional policies. We define the Event class with fields
name, location, time, and description. Up to line 5, this looks
like a standard Django schema definition.

     1  class Event(JModel):
     2      name = CharField(max_length=256)
     3      location = CharField(max_length=512)
     4      time = DateTimeField()
     5      description = CharField(max_length=1024)
     6
     7      # Public value for name field.
     8      @staticmethod
     9      def jacqueline_get_public_name(event):
    10          return "Private event"
    11
    12      # Public value for location field.
    13      @staticmethod
    14      def jacqueline_get_public_location(event):
    15          return "Undisclosed location"
    16
    17      # Policies for name and location fields.
    18      @staticmethod
    19      @label_for('name', 'location')
    20      @jacqueline
    21      def jacqueline_restrict_event(event, ctxt):
    22          return (EventGuest.objects.get(
    23              event=self, guest=ctxt) != None)
    24
    25  class EventGuest(JModel):
    26      event = ForeignKey(Event)
    27      guest = ForeignKey(UserProfile)

Figure 2. Jacqueline schema fragment for calendar events.

2.1.1 Secret Values and Public Values 

A sensitive value in Jacqueline encapsulates a secret
(high-confidentiality) view available only to viewers with sufficient
permissions and a public (low-confidentiality) view available
to other viewers. Jacqueline allows sensitive values to behave
as either the secret value or public value, depending on
viewing context (i.e. the user viewing a page).

The actual field value is the secret view and the programmer
must additionally define a method computing the public
view. On line 9 we define the jacqueline_get_public_name
method computing the public view of the name field. If the
permissions prohibit a viewer from seeing the sensitive name
field, then the name field will behave as "Private event"
throughout all computations, including database queries. This
function takes the current row object (event) as an argument,
allowing public values to be computed using row
fields. The Jacqueline ORM uses naming conventions (i.e. the
jacqueline_get_public prefix) to find the appropriate methods
to compute public views.

2.1.2 Specifying Policies 

In Jacqueline, programmer-specified information flow policies
guard the flow of sensitive values. On line 21 we implement
the policy for the fields name and location, as indicated
by the label_for decorator. The policy is a method that takes
two arguments, the current row object (event) and the viewer
(ctxt) corresponding to the user looking at a page. Our policy
queries the EventGuest table (line 25) to determine whether
the viewer is associated with the event.

Without Jacqueline, the programmer would need to implement
an equivalent function and call it whenever the location
value is used. Using Jacqueline, the program no longer needs
to explicitly perform these policy checks because Jacqueline’s
ORM and application runtime ensure that the policy is
enforced. Jacqueline handles mutable state by enforcing this
policy with respect to the value of event at the time a value is
created and the state of the system at the time of output.

2.2 Faceted Execution 

Jacqueline uses an enhanced application runtime that keeps
track of the secret and public views of sensitive values and
results of computations on sensitive values. Once the programmer
associates policies with sensitive data fields, the
rest of the program may be policy-agnostic. We call create
in Jacqueline the same way as in Django:

    carolParty = Event.objects.create(
        name="Carol's surprise party",
        location="Schloss Dagstuhl", ...)

To manage the policies, the Jacqueline FORM creates faceted
values for the sensitive fields. For the name field, the framework
creates the faceted value (k ? "Carol’s surprise party" :
"Private event"), where k is a fresh Boolean label guarding
the secret actual field value and the public facet computed
from the jacqueline_get_public_name method. The runtime eventually
assigns label values based on policies and the viewer. We
describe in Section 3 how the FORM stores faceted values in
a relational database.

The runtime evaluates faceted values by evaluating each of
the facets. Evaluating "Alice’s events: " + str(alice.events)
yields the resulting faceted value guarded by the same label:

    (k ? "Alice’s events: Carol’s surprise party"
       : "Alice’s events: Private event")

Guests of the event will see "Carol’s surprise party" as part 
of the list of Alice’s events, while others will see only 
"Private event". Faceted execution propagates labels through 
all derived values, conditionals, and variable assignments to 
prevent indirect and implicit flows. 
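
The propagation of a label through a derived value can be illustrated with a minimal sketch. The `Facet` class and the `fmap` and `concretize` helpers below are hypothetical names chosen for this illustration; they are not part of the Jacqueline or Jeeves API, and the sketch ignores conditionals and assignments.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Facet:
    """(label ? high : low): `high` is the secret view, `low` the public one."""
    label: str
    high: Any
    low: Any


def fmap(fn: Callable[[Any], Any], value: Any) -> Any:
    """Apply fn to every leaf, keeping the guarding labels in place."""
    if isinstance(value, Facet):
        return Facet(value.label, fmap(fn, value.high), fmap(fn, value.low))
    return fn(value)


def concretize(value: Any, labels: Dict[str, bool]) -> Any:
    """Project a faceted value to the single view chosen by a label assignment."""
    while isinstance(value, Facet):
        value = value.high if labels[value.label] else value.low
    return value


# The name field from the running example, guarded by label k.
name = Facet("k", "Carol's surprise party", "Private event")

# String concatenation is applied to both facets; the result is still guarded by k.
listing = fmap(lambda n: "Alice's events: " + n, name)
```

Calling `concretize(listing, {"k": False})` selects the public facet, so a viewer who is not a guest only ever observes a string derived from "Private event".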

Jacqueline performs faceted execution for database queries, 
preventing indirect flows through queries like the following: 

    Event.objects.filter(
        location="Schloss Dagstuhl")

If carolParty is the only event in the database, faceted execu¬ 
tion of the filter query yields a faceted list (m ? [carolParty] : 
[]). Viewers who should not be able to see the location field 
will not be able to see values derived from the sensitive field. 

Jacqueline also prevents implicit leaks through writes to 
the database. For instance, consider this code that replaces 
the description field of Event rows with "Dagstuhl event!" 
when the location field is "Schloss Dagstuhl": 


    for loc in Event.objects.all():
        if loc.location == "Schloss Dagstuhl":
            loc.description = "Dagstuhl event!"
            save(loc)

For carolParty the condition evaluates to (k ? True : False).
The runtime records the influence of k when evaluating the
conditional so that the call to save writes (k ? carolPartyNew :
carolParty), where carolPartyNew is the updated value.

2.3 Computing Concrete Views 

Computation sinks such as print take an additional argument
corresponding to the viewer and resolve policies
according to that viewer. For instance,
print carolParty.name displays "Carol’s surprise party" to
some viewers and "Private event" to others. The programmer
does not need to designate the viewer: it can be an
implicit parameter set from authorization information.

The policies and viewer define a system of constraints for
determining label values. Printing carolParty.name to alice
corresponds to the following constraint:

    k ⇒ (EventGuest.objects.get(
             event=self, guest=ctxt) != None)

To account for dependencies on mutable state, the runtime 
evaluates this constraint in terms of the guest list at the time of 
output. Labels are the only free variables in the fully evaluated 
constraints. There is always a consistent assignment to the 
labels: assigning all labels to False is always valid. 

The constraint semantics allows Jacqueline to handle 
mutual dependencies between policies and sensitive values. 
Suppose that the guest list policy depended on the list itself: 

    @label_for('guest')
    def jacqueline_restrict_guest(eventguest, ctxt):
        return (EventGuest.objects.get(
            event=eventguest.e, guest=ctxt) != None)

The policy requires that there must be an entry in the 
EventGuest table where the guest field is the viewer ctxt, so 
the policy for the guest field depends on the value of the field 
itself. There are two valid outcomes for a viewer who has 
access: either the system shows empty fields or the system 
shows the actual fields. Jacqueline always attempts to show 
values unless policies require otherwise. Note that unless 
there are mutual dependencies, Jacqueline may determine 
label values by evaluating policies directly. 
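
For the common case without mutual dependencies, label resolution at a computation sink can be sketched as direct policy evaluation. Everything below is an illustrative assumption: the `event_guests` set stands in for the EventGuest table, and `resolve_labels` and `show` are hypothetical helpers, not Jacqueline functions. Setting each label to the value of its policy satisfies the constraint (a label may be True only if its policy holds) and shows values whenever the policy permits.

```python
from typing import Callable, Dict, Set, Tuple

# Hypothetical in-memory stand-in for the EventGuest table: (event, guest) pairs.
event_guests: Set[Tuple[str, str]] = {
    ("carolParty", "alice"),
    ("carolParty", "bob"),
}

# One policy per label; each policy is a predicate over the viewer.
policies: Dict[str, Callable[[str], bool]] = {
    "k": lambda viewer: ("carolParty", viewer) in event_guests,
}


def resolve_labels(viewer: str) -> Dict[str, bool]:
    """Evaluate every policy against the state at the time of output."""
    return {label: policy(viewer) for label, policy in policies.items()}


def show(label: str, secret: str, public: str, viewer: str) -> str:
    """Computation sink: pick the facet that the viewer is allowed to see."""
    return secret if resolve_labels(viewer)[label] else public
```

Because the policy is evaluated when `show` runs, adding a pair to `event_guests` after the faceted value was created changes what that viewer sees, which mirrors the treatment of mutable state described above.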

Such circular dependencies are increasingly common in 
real-world applications. Consider, for instance, the following 
policies: a viewer must be within some radius of a secret 
location to see the location; a viewer must be a member of a 
secret list to see the list. Unfortunately, it is common practice 
to execute such policies in a trusted “omniscient” context that 
risks leaking information. 

3. The Faceted Object-Relational Mapping 

Our faceted object-relational mapping (FORM) 1) uses meta-data
to represent faceted values and 2) manages queries by
manipulating meta-data and marshalling to and from the
database representation. Surprisingly, our solution allows
us to use existing relational database implementations for
creating, updating, selecting, joining, and sorting records.
In this section, we introduce the faceted object-relational
mapping (FORM) using SQL syntax and present the Early
Pruning optimization.

3.1 Executing Relational Queries with Facets 

A faceted row is a faceted value containing leaves that
are non-faceted relational records. Any record containing
faceted values may be rewritten to be of this form. We map
each faceted row to multiple database rows by augmenting
records with meta-data columns corresponding to 1) a unique
identifier jid and 2) an identifier jvars describing which facet
the row corresponds to, for instance "k1=True,k2=True".

The FORM is responsible for marshalling between the
database and runtime representations of faceted values. The
FORM stores the faceted value (k ? "Carol’s surprise party" :
"Private event") as two rows in the Event table with the
same jid of 1. The secret facet has a jvars value of "k=True"
and the public facet has a jvars value of "k=False". For nested
facets, we store more labels in the jvars column, for instance
"k1=True,k2=True". In Table 1 we show how this faceted
value would look in an augmented table.
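
The marshalling direction from a faceted row to database rows can be sketched as follows. This is a minimal illustration under stated assumptions: records are plain dictionaries, and `Facet` and `to_rows` are hypothetical names rather than the FORM's actual interface. Only the jid and jvars conventions are taken from the text.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class Facet:
    label: str
    high: Any  # secret facet
    low: Any   # public facet


def to_rows(value: Any, jid: int,
            path: Tuple[Tuple[str, bool], ...] = ()) -> List[Dict[str, Any]]:
    """Flatten a faceted row into one database row per leaf record.

    Every emitted row shares the same jid; jvars records the branch taken
    at each label on the way to the leaf, e.g. "k1=True,k2=True".
    """
    if isinstance(value, Facet):
        return (to_rows(value.high, jid, path + ((value.label, True),)) +
                to_rows(value.low, jid, path + ((value.label, False),)))
    jvars = ",".join(f"{label}={flag}" for label, flag in path)
    return [dict(value, jid=jid, jvars=jvars)]


rows = to_rows(
    Facet("k", {"name": "Carol's surprise party"}, {"name": "Private event"}),
    jid=1,
)
```

Here `rows` contains two dictionaries with jid 1, one tagged "k=True" and one tagged "k=False", matching the two-row layout described for the Event table.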

3.1.1 Queries That Track Sensitive Values 

A key advantage of our representation is that the FORM 
can issue standard relational queries not only for selections 
and projections, but also joins and sorts. Storing each facet 
in a different row allows the FORM to rely on the correct 
marshalling of query results for preventing indirect flows 
through queries. Note that the FORM would not be able to 
issue relational queries in such a straightforward way, for 
instance, if it stored each faceted value in the same row, or if 
it stored different facets in different databases. 

Consider the query SELECT * FROM Event WHERE
location = "Schloss Dagstuhl" on the rows from Table 1.
Issuing the query directly on the augmented database will
return the one matching row with jid=1 and jvars="k=True".
Reconstructing the facet structure yields a faceted value
guarded by label k with a collection containing the record
in the secret facet and an empty collection in the other facet.
Relying on unmarshalling is sufficient for faceted execution.
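
The following sketch runs that selection on an in-memory SQLite database holding the two rows of Table 1 and then projects the result for a given label assignment. It uses Python's standard `sqlite3` module; the helper names `parse_jvars`, `visible`, and `query_view` are assumptions made for this illustration, and the sketch projects to one concrete view instead of rebuilding the full facet tree as the FORM would.

```python
import sqlite3
from typing import Dict, List, Tuple

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Event (id INTEGER PRIMARY KEY, name TEXT, "
    "location TEXT, jid INTEGER, jvars TEXT)"
)
conn.executemany(
    "INSERT INTO Event VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Carol's surprise party", "Schloss Dagstuhl", 1, "k=True"),
        (2, "Private event", "Undisclosed location", 1, "k=False"),
    ],
)


def parse_jvars(jvars: str) -> Dict[str, bool]:
    """Turn "k1=True,k2=False" into {"k1": True, "k2": False}."""
    if not jvars:
        return {}
    pairs = (part.split("=") for part in jvars.split(","))
    return {label: flag == "True" for label, flag in pairs}


def visible(jvars: str, labels: Dict[str, bool]) -> bool:
    """A row belongs to a view if every branch in jvars agrees with the labels."""
    return all(labels[label] == flag for label, flag in parse_jvars(jvars).items())


def query_view(sql: str, params: Tuple, labels: Dict[str, bool]) -> List[Tuple]:
    """Run an ordinary SQL query, then keep only rows visible under `labels`.

    The query must select jvars as its last column.
    """
    rows = conn.execute(sql, params).fetchall()
    return [row[:-1] for row in rows if visible(row[-1], labels)]


SQL = "SELECT name, location, jvars FROM Event WHERE location = ?"
```

With the label assignment `{"k": True}` the query yields the secret record, and with `{"k": False}` it yields an empty list, which are the two facets of the result described above.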

Surprisingly, rows from joins that occur based on sensitive
values will also be guarded by the appropriate
path conditions. The only additional considerations the
FORM needs to make for joins are to 1) take into account the
jvars fields from both tables and 2) ensure that foreign keys
(references into another table) use jid rather than the primary
key. In Table 2, we show an example where the WHERE
clause filters on the results of a JOIN. In the ON clause, we
use the jid rather than id. In the SELECT clause, we include
the UserProfile.jvars as well as the EventGuest.jvars field.


A particularly nice consequence of storing each facet
in different rows is that the FORM can take advantage of
SQL’s ORDER BY functionality for sorting. Suppose we
had faceted records, each with a single field f, with values
(a ? "Charlie" : "***"), (b ? "Bob" : "***"), and (c ? "Alice" :
"***"). The FORM can use the standard sorting procedure
without leaking information because the secret values are
stored in different rows from the public values. Correct
unmarshalling will enforce the policies so that, for instance,
an output context with the permitted labels {a, ¬b, c} would
see ["***", "Alice", "Charlie"].
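
The sorting example can be reproduced with a small sketch on SQLite. The table name `Person` and the helper `sorted_view` are assumptions for this illustration; the sort itself is an unmodified ORDER BY, and the policy is applied only when rows are filtered by their jvars value. The sketch assumes one label per row and SQLite's default binary collation, under which "***" sorts before the names.

```python
import sqlite3
from typing import Dict, List

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (f TEXT, jid INTEGER, jvars TEXT)")
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?)",
    [
        ("Charlie", 1, "a=True"), ("***", 1, "a=False"),
        ("Bob", 2, "b=True"), ("***", 2, "b=False"),
        ("Alice", 3, "c=True"), ("***", 3, "c=False"),
    ],
)


def sorted_view(labels: Dict[str, bool]) -> List[str]:
    """Sort with plain SQL, then keep the rows visible under `labels`."""
    rows = conn.execute("SELECT f, jvars FROM Person ORDER BY f").fetchall()
    view = []
    for value, jvars in rows:
        label, flag = jvars.split("=")  # single-label jvars, e.g. "a=True"
        if labels[label] == (flag == "True"):
            view.append(value)
    return view
```

For the output context {a, ¬b, c}, `sorted_view({"a": True, "b": False, "c": True})` returns the placeholder followed by "Alice" and "Charlie", so the position of Bob's record reveals nothing about his name.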

A limitation is that the FORM cannot use existing relational
implementations for aggregation, for instance counting
or summing. Using aggregate queries directly could leak
information because without looking at the path conditions,
these aggregates would combine values across facets. This
does not suggest a fundamental limitation. Applications often
prematerialize aggregates, making it reasonable to use the
faceted runtime to precompute aggregates. Otherwise, supporting
faceted aggregation at scale is a matter of optimizing
the procedures, perhaps as database user-defined functions.

3.1.2 Creating and Updating Data and Policies 

The FORM creates tables and rows with the appropriate meta-data
to keep track of facets. The FORM prevents implicit
leaks through updates by updating meta-data appropriately
and potentially deleting rows. Invoking save in branches
that depend on faceted values creates facets that incorporate
the path conditions. To add policies, the programmer needs
to manipulate only the meta-data columns (jvars and jid).
Adding policies to legacy data involves adding meta-data
columns. Updating policies using existing labels simply
involves updating policy code.

3.2 Early Pruning Optimization 

An important correctness-preserving optimization is to prune
facets once the runtime knows the viewer. This involves being
able to determine 1) the viewing context and 2) that
policy-relevant state will not change before output. Two
properties of web programs make this analysis simple. First,
the session user is often the viewing context. Second, computation
sinks are easy to identify in model-view-controller
frameworks: most functions either read from the database or
write to the database, but not both. This makes it advantageous
for the framework to speculate on the viewer for “get”
requests. We formalize Early Pruning in Section 4.4.
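
The effect of pruning on a faceted value can be sketched as below. This is an illustration only: `Facet` and `prune` are hypothetical names, and the sketch assumes the label assignment for the session user has already been computed and that policy-relevant state does not change before output, as required above.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Facet:
    label: str
    high: Any  # secret facet
    low: Any   # public facet


def prune(value: Any, known: Dict[str, bool]) -> Any:
    """Collapse facets whose label is already decided for the session viewer.

    Labels missing from `known` are left in place, so the result is still a
    valid faceted value that explores fewer executions.
    """
    if isinstance(value, Facet):
        if value.label in known:
            chosen = value.high if known[value.label] else value.low
            return prune(chosen, known)
        return Facet(value.label, prune(value.high, known), prune(value.low, known))
    return value
```

For a "get" request whose session user resolves k to False, `prune(Facet("k", "secret", Facet("m", "a", "b")), {"k": False})` leaves only the facet on m, so later computation never touches the secret branch.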

3.3 Data Representation Considerations 

It is also important to discuss whether storing faceted values
in the database may be prohibitively expensive. There
are many ways to avoid storing too much data in practice.
Work on multi-level databases [21, 30] suggests it is both
useful and practically feasible to store multiple versions of
data corresponding to different access levels. The question
becomes, then, how to avoid storing too much data due to too
many possible path conditions. An important optimization
involves combining values that are the same to a single view.
In Section 4, we define an optimization to allow sharing rows
that different facets have in common.

    id  name                 location                jid  jvars
    1   "Carol’s ... party"  "Schloss Dagstuhl"      1    "k=True"
    2   "Private event"      "Undisclosed location"  1    "k=False"

Table 1. Example table.

    ORM call:

        EventGuest.objects.filter(guest__name="Alice")

    Django Query:

        SELECT EventGuest.event, EventGuest.guest
        FROM EventGuest
        JOIN UserProfile
          ON EventGuest.guest_id = UserProfile.id
        WHERE UserProfile.name = 'Alice';

    Jacqueline Query:

        SELECT EventGuest.event, EventGuest.guest,
               EventGuest.jid, EventGuest.jvars,
               UserProfile.jvars
        FROM EventGuest
        JOIN UserProfile
          ON EventGuest.guest_id = UserProfile.jid
        WHERE UserProfile.name = 'Alice';

Table 2. Translated ORM queries in Django vs. Jacqueline.

4. Formal Semantics and Policy Compliance 

We model the faceted object-relational mapping with the
idealized core language called $\lambda^{JDB}$. We prove that
$\lambda^{JDB}$ satisfies termination-insensitive non-interference
and policy compliance across the application and database.

4.1 Syntax and Formal Semantics 

The language $\lambda^{JDB}$ extends the Jeeves core language [7] with
support for databases, which we model as relational tables.
Figure 3 summarizes the $\lambda^{JDB}$ syntax, with the constructs
from the Jeeves core language marked in gray. The Jeeves core
language, in turn, extends the standard imperative $\lambda$-calculus
with constructs for declaring new labels ($\mathsf{label}\ k\ \mathsf{in}\ e$),
for imperatively attaching policies to labels ($\mathsf{restrict}(k, e)$),
and for creating faceted values ($\langle k \,?\, e_H : e_L \rangle$).
This last expression behaves like $e_H$ from the perspective of any
principal authorized to see data with label $k$ and like $e_L$ for all
other principals. Note that $\lambda^{JDB}$ does not include imperative
updates to tables, but we can model updates by introducing a layer of
indirection where we access tables via references and updating a table
corresponds to replacing the contents of the appropriate reference.

The language $\lambda^{JDB}$ extends the Jeeves core language with support
for databases, where each table is a (possibly empty) sequence
of rows and each row is a sequence of strings. We require
that all rows in a table have the same size. To manipulate
tables, $\lambda^{JDB}$ includes the usual operators of the relational
calculus: selection ($\sigma_{i=j}\ e$), which selects the rows in a
table where fields $i$ and $j$ are identical, projection ($\pi_{\bar{i}}\ e$),
which returns a new table containing columns $\bar{i}$ from the
table $e$, cross-product ($e_1 \bowtie e_2$), which returns all possible
combinations of rows from $e_1$ and $e_2$, and union ($e_1 \cup e_2$),
which appends two tables. The construct $\mathsf{row}\ e$ creates a
new single-row table. The fold operation $\mathsf{fold}\ e_f\ e_p\ e_t$
supports iterating, or folding, over tables. Fold has the “type”
$\forall A, B.\ (B \to A \to B) \to B \to \mathsf{table}\ A \to B$.

    Term
        $x$                                    variable
        $c$                                    constant
        $\lambda x.e$                          abstraction
        $e_1\ e_2$                             application
        $\mathsf{ref}\ e$                      reference allocation
        $!e$                                   dereference
        $e_1 := e_2$                           assignment
        $\langle k \,?\, e_H : e_L \rangle$    faceted expression
        $\mathsf{label}\ k\ \mathsf{in}\ e$    label declaration
        $\mathsf{restrict}(k, e)$              policy specification
        $\mathsf{row}\ e$                      create a table
        $\sigma_{i=j}\ e$                      select rows where i = j
        $\pi_{\bar{i}}\ e$                     project columns
        $e_1 \bowtie e_2$                      join or cross-product of tables
        $e_1 \cup e_2$                         union of tables
        $\mathsf{fold}\ e_f\ e_p\ e_t$         table fold

    Statement
        $\mathsf{let}\ x = e\ \mathsf{in}\ S$  let statement
        $\mathsf{print}\ \{e_v\}\ e_r$         print statement

    Constant
        $f$                                    file handle
        $b$                                    boolean
        $i$                                    integer
        $s$                                    string

    Variable    $x$
    Label       $k$

Figure 3. $\lambda^{JDB}$ syntax.


4.2 Formal Semantics 


We formalize the big-step semantics as the relation
$\Sigma, e \Downarrow_{pc} \Sigma', V$, denoting that expression $e$ and
store $\Sigma$ evaluate to $V$, producing a new store $\Sigma'$. The
program counter $pc$ is a set of branches. Each branch is either a
label $k$ or a negated label $\neg k$. Association with $k$ means the
computation is visible only to principals authorized to see $k$, and
association with $\neg k$ means it is visible only to principals not
authorized to see $k$.

We chose our representation of faceted databases to be
faithful to realistic implementation strategies. We could represent
faceted tables as $\langle k \,?\, \mathsf{table}\ T_1 : \mathsf{table}\ T_2 \rangle$,
but this approach would incur significant space overhead, as it requires
storing two copies of possibly large database tables, possibly
with only small differences between the two tables. Instead,
we use the more efficient approach of faceted rows, where
each row $(B, s)$ in the database includes a set of branches $B$
describing who can see that row. For example, the expression
$\langle k \,?\, \mathsf{row}\ \text{"Alice"}\ \text{"Smith"} : \mathsf{row}\ \text{"Bob"}\ \text{"Jones"} \rangle$
evaluates to the following table¹:

    $(\{k\}, (\text{"Alice"}, \text{"Smith"}))$
    $(\{\neg k\}, (\text{"Bob"}, \text{"Jones"}))$

Note that we do not model the facet identifier row jid, as it is
not necessary for the formal semantics or proof.

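To accommodate both faceted values and faceted tables,
we define the partial operation $\langle\!\langle \cdot \,?\, \cdot : \cdot \rangle\!\rangle$
to create either a new faceted value or a table with internal
branches on rows:

$$
\begin{aligned}
\langle\!\langle \cdot \,?\, \cdot : \cdot \rangle\!\rangle &: \mathit{Label} \times \mathit{Val} \times \mathit{Val} \to \mathit{Val} \\
\langle\!\langle k \,?\, F_H : F_L \rangle\!\rangle &= \langle k \,?\, F_H : F_L \rangle \\
\langle\!\langle k \,?\, \mathsf{table}\ T_H : \mathsf{table}\ T_L \rangle\!\rangle &= \mathsf{table}\ T \\
\text{where } T &= \{ (B, s) \mid (B, s) \in T_H \cap T_L \} \\
&\quad \cup\ \{ (B \cup \{k\}, s) \mid (B, s) \in T_H \setminus T_L,\ \neg k \notin B \} \\
&\quad \cup\ \{ (B \cup \{\neg k\}, s) \mid (B, s) \in T_L \setminus T_H,\ k \notin B \}
\end{aligned}
$$

Wrapping a facet with label $k$ around non-table values $F_H$
and $F_L$ simply creates a faceted value containing $k$, $F_H$, and
$F_L$. Wrapping a facet with label $k$ around tables $T_H$ and $T_L$
creates a new table $T$ containing the rows from $T_H$ and $T_L$,
annotated with $k$ and $\neg k$ respectively, with an optimization
to share the rows that $T_H$ and $T_L$ have in common. We extend
this operator to sets of branches:

$$
\begin{aligned}
\langle\!\langle \cdot \,?\, \cdot : \cdot \rangle\!\rangle &: \mathit{Branches} \times \mathit{Val} \times \mathit{Val} \to \mathit{Val} \\
\langle\!\langle \emptyset \,?\, V_H : V_L \rangle\!\rangle &= V_H \\
\langle\!\langle \{k\} \cup B \,?\, V_H : V_L \rangle\!\rangle &= \langle\!\langle k \,?\, \langle\!\langle B \,?\, V_H : V_L \rangle\!\rangle : V_L \rangle\!\rangle \\
\langle\!\langle \{\neg k\} \cup B \,?\, V_H : V_L \rangle\!\rangle &= \langle\!\langle k \,?\, V_L : \langle\!\langle B \,?\, V_H : V_L \rangle\!\rangle \rangle\!\rangle
\end{aligned}
$$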
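
The table case of the facet-wrapping operation can be sketched in a few lines. The encoding below is an assumption made for illustration: a branch is a pair such as ("k", True) for $k$ or ("k", False) for $\neg k$, a row is a pair of a frozen set of branches and a tuple of strings, and a table is modeled as a set of rows (ignoring row order). The name `facet_tables` is hypothetical.

```python
from typing import FrozenSet, Set, Tuple

Branch = Tuple[str, bool]                       # ("k", True) is k, ("k", False) is ¬k
Row = Tuple[FrozenSet[Branch], Tuple[str, ...]]  # (B, s)


def facet_tables(k: str, t_high: Set[Row], t_low: Set[Row]) -> Set[Row]:
    """Combine two tables under label k, sharing the rows they have in common."""
    # Rows present in both tables are visible either way and stay unannotated.
    shared = t_high & t_low
    # Rows only in the high table gain branch k, unless they already require ¬k.
    high_only = {(b | {(k, True)}, s)
                 for (b, s) in t_high - t_low if (k, False) not in b}
    # Rows only in the low table gain branch ¬k, unless they already require k.
    low_only = {(b | {(k, False)}, s)
                for (b, s) in t_low - t_high if (k, True) not in b}
    return shared | high_only | low_only


t_high = {(frozenset(), ("Alice", "Smith"))}
t_low = {(frozenset(), ("Bob", "Jones"))}
merged = facet_tables("k", t_high, t_low)
```

On this input `merged` holds the Alice row annotated with $k$ and the Bob row annotated with $\neg k$, which is the two-row table from the example in this section; wrapping a table with itself returns it unchanged because every row is shared.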

We show the faceted evaluation rules in Figures 4 and 5.
The key rule is [F-SPLIT], describing how evaluation of a
faceted expression $\langle k \,?\, e_1 : e_2 \rangle$ involves evaluating
the subexpressions in sequence. Evaluation adds $k$ to the program
counter to evaluate $e_1$ and $\neg k$ to evaluate $e_2$ and then joins
the results in the operation $\langle\!\langle k \,?\, V_1 : V_2 \rangle\!\rangle$.
The rules [F-LEFT] and [F-RIGHT] show that only one expression is
evaluated if the program counter already contains either $k$ or $\neg k$.

¹ Note that this value representation does not support mixed expressions such
as $\langle k \,?\, 3 : \mathsf{row}\ \text{"Alice"} \rangle$, which mix integers and
tables in the same faceted values. Programs that try to unnaturally mix values
will get stuck.

Our rules use contexts to describe faceted execution. The
rule [F-CTXT] for $E[e]$ enables evaluation of a subexpression
inside an evaluation context. We use $S$ to range over strict
operator contexts, operations that require a non-faceted value.

If an expression in a strict context yields a faceted value
$\langle k \,?\, V_H : V_L \rangle$, then the rule [F-STRICT] applies the strict
operator to each of $V_H$ and $V_L$. For example, the evaluation
of $\langle k \,?\, f : g \rangle(4)$ reduces to the evaluation of $\langle k \,?\, f(4) : g(4) \rangle$,
where $S$ in this case is $\bullet\,(4)$. The rules [F-SELECT],
[F-PROJECT], [F-JOIN], and [F-UNION] formalize the
relational calculus operators on tables of faceted rows.
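The interplay of [F-SPLIT], [F-LEFT], [F-RIGHT], and [F-STRICT] can be sketched as a small Python evaluator. This is a simplified illustration under stated assumptions: it omits the store, handles only non-table values, models subexpressions as functions of the program counter, and all names (`Facet`, `eval_facet`, `apply_strict`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Facet:
    label: str
    high: object
    low: object


def eval_facet(pc, k, e_high, e_low):
    """Evaluate <k ? e_high : e_low> under program counter pc.

    e_high and e_low are thunks that take the program counter to run under.
    """
    if (k, True) in pc:       # [F-LEFT]: pc already contains k
        return e_high(pc)
    if (k, False) in pc:      # [F-RIGHT]: pc already contains ¬k
        return e_low(pc)
    # [F-SPLIT]: evaluate both sides in sequence, then join the results.
    v1 = e_high(pc | {(k, True)})
    v2 = e_low(pc | {(k, False)})
    return Facet(k, v1, v2)


def apply_strict(op, value, pc=frozenset()):
    """[F-STRICT]: push a strict operation through the facets of its operand."""
    if isinstance(value, Facet):
        return eval_facet(
            pc, value.label,
            lambda pc1: apply_strict(op, value.high, pc1),
            lambda pc1: apply_strict(op, value.low, pc1))
    return op(value)
```

For instance, with `f = lambda x: x + 1` and `g = lambda x: x * 2`, the call `apply_strict(lambda fn: fn(4), Facet("k", f, g))` mirrors the reduction of the faceted application to a facet of `f(4)` and `g(4)`.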

The rules for folding are more interesting. If a row $(B, s)$
is inconsistent with (i.e., not visible to) the current program
counter label $pc$, then rule [F-FOLD-INCONSISTENT] ignores
that row. If the row is consistent, then rule [F-FOLD-CONSISTENT]
applies the fold operator $V_f$ to the row contents $s$ and the
accumulator $V'$, producing a new accumulator $V''$. The result
of that fold step is $\langle\langle B \,?\, V'' : V' \rangle\rangle$, a faceted expression that
appears like $V''$ to principals that can see the $B$-labeled row
and like $V'$ to other principals.
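A faceted fold along these lines might look as follows in Python. This is a sketch, not the formal semantics: the store is omitted, rows are a Python list of `(B, s)` pairs, and "B consistent with pc" is assumed to mean that the two sets never contain both a label and its negation. The helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Facet:
    label: str
    high: object
    low: object


def consistent(branches, pc):
    """Assumed reading: B and pc never demand both k and ¬k."""
    return not any((k, not positive) in pc for (k, positive) in branches)


def mk_facet_branches(branches, v_high, v_low):
    """<<B ? v_high : v_low>> for non-table values."""
    result = v_high
    for (k, positive) in sorted(branches):
        result = Facet(k, result, v_low) if positive else Facet(k, v_low, result)
    return result


def lift(op, value, pc):
    """Apply op to a possibly faceted accumulator under program counter pc."""
    if isinstance(value, Facet):
        k = value.label
        if (k, True) in pc:
            return lift(op, value.high, pc)
        if (k, False) in pc:
            return lift(op, value.low, pc)
        return Facet(k,
                     lift(op, value.high, pc | {(k, True)}),
                     lift(op, value.low, pc | {(k, False)}))
    return op(value)


def faceted_fold(f, init, rows, pc=frozenset()):
    if not rows:                          # [F-FOLD-EMPTY]
        return init
    (branches, s), rest = rows[0], rows[1:]
    acc = faceted_fold(f, init, rest, pc)
    if not consistent(branches, pc):      # [F-FOLD-INCONSISTENT]
        return acc
    # [F-FOLD-CONSISTENT]: run the fold step under pc ∪ B, then guard it by B.
    new_acc = lift(lambda a: f(s, a), acc, pc | branches)
    return mk_facet_branches(branches, new_acc, acc)
```

Folding a function that collects first names over the "Alice"/"Bob" table produces an accumulator that looks like `["Alice"]` to viewers for whom k holds and like `["Bob"]` to everyone else.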

The faceted execution semantics describe the propagation
of labels and facets for the purpose of complying with policies
at computation sinks. λJDB expressions do not perform
I/O, while λJDB statements include the effectful construct
print $\{e_v\}\ e_r$ that prints expression $e_r$ under the policies and
viewing context $e_v$. We provide the λjeeves rules for declaring
labels, attaching policies, and assigning labels for printing
in Appendix A. The @label_for and Jacqueline_restrict constructs
correspond to the [F-LABEL] and [F-RESTRICT] rules.


4.3 Application-Database Policy Compliance 

λjeeves has the properties that 1) a single faceted execution
is equivalent to multiple different executions without faceted
values and 2) the system cannot leak sensitive information
through the output or the choice of output channel. We prove
that the properties extend to λJDB.

The proof involves extending the projection property of
λjeeves: a single execution with faceted values projects to
multiple different executions without faceted values. To prove
this property, we first define what it means to be a view and
to be visible. A view $L$ is a set of principals. $B$ is visible to
view $L$ (written $B \sim L$) if $\forall k \in B.\ k \in L$ and $\forall \neg k \in B.\ k \notin L$.



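Runtime Syntax

\[
\begin{array}{rcll}
e & \in & \mathit{Expr} & ::= \dots \mid a \mid \text{table } T \\
\Sigma & \in & \mathit{Store} & = (\mathit{Addr} \to_p \mathit{Val}) \cup (\mathit{Label} \to \mathit{Val}) \\
R & \in & \mathit{RawValue} & ::= c \mid a \mid (\lambda x.e) \\
a & \in & \mathit{Address} & \\
F & \in & \mathit{FacetedValue} & ::= R \mid \langle k \,?\, F_1 : F_2 \rangle \\
T & \in & \mathit{Table} & = (\mathit{Branches} \times \mathit{String}^{*})^{*} \\
V & \in & \mathit{Val} & ::= F \mid \text{table } T \\
b & \in & \mathit{Branch} & ::= k \mid \neg k \\
pc, B & \in & \mathit{Branches} & ::= b^{*}
\end{array}
\]

Evaluation Contexts

\[
\begin{array}{rcl}
E & ::= & \langle k \,?\, E : e \rangle \mid \langle k \,?\, V : E \rangle \mid \bullet\ e \mid V\ \bullet \mid \text{ref } \bullet \mid {!}\bullet \mid \bullet := e \mid V := \bullet \\
 & \mid & \text{row } V \dots \bullet\ e \dots \mid \sigma_{i=j}\ \bullet \mid \pi_{\bar{i}}\ \bullet \mid \bullet \bowtie e \mid V \bowtie \bullet \mid \bullet \cup e \mid V \cup \bullet \\
 & \mid & \text{fold } \bullet\ e\ e \mid \text{fold } V\ \bullet\ e \mid \text{fold } V\ V\ \bullet
\end{array}
\]

Strict Contexts

\[
\begin{array}{rcl}
S & ::= & \bullet\ e \mid {!}\bullet \mid \bullet := V \mid \sigma_{i=j}\ \bullet \mid \pi_{\bar{i}}\ \bullet \mid \bullet \bowtie V \mid \text{table } T \bowtie \bullet \mid \bullet \cup V \\
 & \mid & \text{table } T \cup \bullet \mid \text{row } V \dots \bullet\ e \dots \mid \text{fold } V\ V\ \bullet
\end{array}
\]

Expression Evaluation Rules for λjeeves Subset: $\Sigma, e \Downarrow_{pc} \Sigma', V$

\[
\dfrac{\ }{\Sigma, V \Downarrow_{pc} \Sigma, V}\ \text{[F-VAL]}
\]

\[
\dfrac{a \notin \mathit{dom}(\Sigma) \qquad \Sigma' = \Sigma[a := \langle\langle pc \,?\, V : 0 \rangle\rangle]}{\Sigma, \text{ref } V \Downarrow_{pc} \Sigma', a}\ \text{[F-REF]}
\]

\[
\dfrac{a \notin \mathit{dom}(\Sigma)}{\Sigma, {!}a \Downarrow_{pc} \Sigma, 0}\ \text{[F-DEREF-NULL]}
\qquad
\dfrac{a \in \mathit{dom}(\Sigma)}{\Sigma, {!}a \Downarrow_{pc} \Sigma, \Sigma(a)}\ \text{[F-DEREF]}
\]

\[
\dfrac{\Sigma' = \Sigma[a := \langle\langle pc \,?\, V : \Sigma(a) \rangle\rangle]}{\Sigma, a := V \Downarrow_{pc} \Sigma', V}\ \text{[F-ASSIGN]}
\]

\[
\dfrac{E \neq [\,] \qquad e \text{ not a value} \qquad \Sigma, e \Downarrow_{pc} \Sigma', V' \qquad \Sigma', E[V'] \Downarrow_{pc} \Sigma'', V''}{\Sigma, E[e] \Downarrow_{pc} \Sigma'', V''}\ \text{[F-CTXT]}
\]

\[
\dfrac{\Sigma, e[x := V] \Downarrow_{pc} \Sigma', V'}{\Sigma, (\lambda x.e)\ V \Downarrow_{pc} \Sigma', V'}\ \text{[F-APP]}
\]

\[
\dfrac{k \notin pc \qquad \neg k \notin pc \qquad \Sigma, e_1 \Downarrow_{pc \cup \{k\}} \Sigma_1, V_1 \qquad \Sigma_1, e_2 \Downarrow_{pc \cup \{\neg k\}} \Sigma', V_2 \qquad V' = \langle\langle k \,?\, V_1 : V_2 \rangle\rangle}{\Sigma, \langle k \,?\, e_1 : e_2 \rangle \Downarrow_{pc} \Sigma', V'}\ \text{[F-SPLIT]}
\]

\[
\dfrac{k \in pc \qquad \Sigma, e_1 \Downarrow_{pc} \Sigma', V}{\Sigma, \langle k \,?\, e_1 : e_2 \rangle \Downarrow_{pc} \Sigma', V}\ \text{[F-LEFT]}
\qquad
\dfrac{\neg k \in pc \qquad \Sigma, e_2 \Downarrow_{pc} \Sigma', V}{\Sigma, \langle k \,?\, e_1 : e_2 \rangle \Downarrow_{pc} \Sigma', V}\ \text{[F-RIGHT]}
\]

\[
\dfrac{\Sigma, \langle k \,?\, S[V_H] : S[V_L] \rangle \Downarrow_{pc} \Sigma', V'}{\Sigma, S[\langle k \,?\, V_H : V_L \rangle] \Downarrow_{pc} \Sigma', V'}\ \text{[F-STRICT]}
\]

Figure 4. Faceted evaluation of λJDB without relational operators.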


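We extend views to values:

\[
\begin{aligned}
L &: \mathit{Val}\ (\text{with facets}) \to \mathit{Val}\ (\text{without facets}) \\
L(R) &= R \\
L(\langle k \,?\, F_1 : F_2 \rangle) &=
\begin{cases}
L(F_1) & \text{if } k \in L \\
L(F_2) & \text{if } k \notin L
\end{cases} \\
L(\text{table } T) &= \{ (\emptyset, s) \mid (B, s) \in T,\ B \text{ visible to } L \}
\end{aligned}
\]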
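The visibility relation and the projection of values under a view can be sketched directly from these definitions. The Python below is illustrative only: a view is modeled as a set of label names, a branch as a `(label, polarity)` pair, and the names `visible` and `project` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Branch = Tuple[str, bool]  # ("k", True) is k; ("k", False) is ¬k
Row = Tuple[FrozenSet[Branch], Tuple[str, ...]]


@dataclass(frozen=True)
class Facet:
    label: str
    high: object
    low: object


@dataclass(frozen=True)
class Table:
    rows: FrozenSet[Row]


def visible(branches, view):
    """B ~ L: every positive label is in the view, every negated one is not."""
    return all((k in view) == positive for (k, positive) in branches)


def project(view, value):
    """L(V): erase facets and row branches according to the view."""
    if isinstance(value, Facet):
        chosen = value.high if value.label in view else value.low
        return project(view, chosen)
    if isinstance(value, Table):
        return Table(frozenset((frozenset(), s)
                               for (b, s) in value.rows
                               if visible(b, view)))
    return value  # raw values are unchanged
```

Projecting the "Alice"/"Bob" table with the view `{"k"}` keeps only the "Alice" row, and projecting with the empty view keeps only the "Bob" row.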


We then prove the Projection Theorem. The full proof is in 
Appendix E. Proofs of the key lemmas are in Appendices B 
and C. 

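Theorem 1 (Projection). Suppose $\Sigma, e \Downarrow_{pc} \Sigma', V$. Then for
any view $L$ for which $pc$ is visible,

\[
L(\Sigma), L(e) \Downarrow_{\emptyset} L(\Sigma'), L(V)
\]

We extend views to expressions:

\[
L(\langle k \,?\, e_1 : e_2 \rangle) =
\begin{cases}
L(e_1) & \text{if } k \in L \\
L(e_2) & \text{if } k \notin L
\end{cases}
\]

For all other expression types we recursively apply the view
to subexpressions.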


The Projection Theorem allows us to extend the λjeeves property
of termination-insensitive non-interference. To state the
theorem we first define two faceted values to be L-equivalent
if they have identical values for the view $L$. This notion of
L-equivalence naturally extends to stores ($\Sigma_1 \sim_L \Sigma_2$) and
expressions ($e_1 \sim_L e_2$). The theorem is as follows:






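\[
\dfrac{\ }{\Sigma, \text{row } s \Downarrow_{pc} \Sigma, (\text{table } (\epsilon, s))}\ \text{[F-ROW]}
\]

\[
\dfrac{\ }{\Sigma, (\text{table } T_1) \cup (\text{table } T_2) \Downarrow_{pc} \Sigma, (\text{table } T_1.T_2)}\ \text{[F-UNION]}
\]

\[
\dfrac{T' = \{ (B, s_1 \dots s_n) \in T \mid s_i = s_j \}}{\Sigma, \sigma_{i=j}\ (\text{table } T) \Downarrow_{pc} \Sigma, (\text{table } T')}\ \text{[F-SELECT]}
\]

\[
\dfrac{T' = \{ (B, s_{i_1} \dots s_{i_n}) \mid (B, s_1 \dots s_m) \in T \}}{\Sigma, \pi_{\bar{i}}\ (\text{table } T) \Downarrow_{pc} \Sigma, (\text{table } T')}\ \text{[F-PROJECT]}
\]

\[
\dfrac{T_3 = \{ (B_1 \cup B_2, s_1 \dots s_m s'_1 \dots s'_n) \mid (B_1, s_1 \dots s_m) \in T_1,\ (B_2, s'_1 \dots s'_n) \in T_2 \}}{\Sigma, (\text{table } T_1) \bowtie (\text{table } T_2) \Downarrow_{pc} \Sigma, (\text{table } T_3)}\ \text{[F-JOIN]}
\]

\[
\dfrac{\ }{\Sigma, \text{fold } V_f\ V_p\ (\text{table } \epsilon) \Downarrow_{pc} \Sigma, V_p}\ \text{[F-FOLD-EMPTY]}
\]

\[
\dfrac{\Sigma, \text{fold } V_f\ V_p\ (\text{table } T) \Downarrow_{pc} \Sigma', V' \qquad B \text{ inconsistent with } pc}{\Sigma, \text{fold } V_f\ V_p\ (\text{table } (B, s).T) \Downarrow_{pc} \Sigma', V'}\ \text{[F-FOLD-INCONSISTENT]}
\]

\[
\dfrac{\Sigma, \text{fold } V_f\ V_p\ (\text{table } T) \Downarrow_{pc} \Sigma', V' \qquad B \text{ consistent with } pc \qquad \Sigma', V_f\ s\ V' \Downarrow_{pc \cup B} \Sigma'', V''}{\Sigma, \text{fold } V_f\ V_p\ (\text{table } (B, s).T) \Downarrow_{pc} \Sigma'', \langle\langle B \,?\, V'' : V' \rangle\rangle}\ \text{[F-FOLD-CONSISTENT]}
\]

Figure 5. Faceted evaluation with relational operators.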


Theorem 2 (Termination-Insensitive Non-Interference).
Let $L$ be any view. Suppose $\Sigma_1 \sim_L \Sigma_2$ and $e_1 \sim_L e_2$, and that:

\[
\Sigma_1, e_1 \Downarrow_{\emptyset} \Sigma'_1, V_1 \qquad \Sigma_2, e_2 \Downarrow_{\emptyset} \Sigma'_2, V_2
\]

then $\Sigma'_1 \sim_L \Sigma'_2$ and $V_1 \sim_L V_2$.

The Termination-Insensitive Non-Interference Theorem allows
us to extend the termination-insensitive policy compliance
theorem of λjeeves [7]; data is revealed to an external
observer only if it is allowed by the policies specified.


4.4 Early Pruning 

The Early Pruning optimization involves shrinking a table $T$
by keeping each row $(B, s)$ only when $B$ is consistent with the
viewer constraint described by $pc$. We show the rule below:

\[
\dfrac{\Sigma, e \Downarrow_{pc} \Sigma', (\text{table } T) \qquad T' = \{ (B, s) \in T \mid B \text{ consistent with } pc \}}{\Sigma, e \Downarrow_{pc} \Sigma', (\text{table } T')}\ \text{[F-PRUNE]}
\]

We prove the Projection Theorem holds with this extension.
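As a rough illustration of [F-PRUNE], the filter below drops rows whose branches contradict the program counter. It assumes that "B consistent with pc" means the two sets never contain both a label and its negation; the function names and the `(label, polarity)` encoding of branches are hypothetical.

```python
def consistent(branches, pc):
    """Assumed reading: B and pc never demand both k and ¬k."""
    return not any((k, not positive) in pc for (k, positive) in branches)


def prune(rows, pc):
    """Keep only rows (B, s) whose branches B are consistent with pc."""
    return [(b, s) for (b, s) in rows if consistent(b, pc)]


rows = [
    (frozenset({("k", True)}), ("Alice", "Smith")),
    (frozenset({("k", False)}), ("Bob", "Jones")),
    (frozenset(), ("Carol", "Lee")),  # unlabeled row, visible to everyone
]
# Once the viewer is known to satisfy k, the ¬k row can never be shown.
pruned = prune(rows, frozenset({("k", True)}))
```

Rows with an empty branch set are never pruned, since they are consistent with every program counter.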


4.5 Policy Dependencies on Sensitive Values 

Policies on a label may contain sensitive values that depend
on the label. The λJDB semantics handles this situation by
propagating labels through all computations and then assigning
label values according to the [F-PRINT] rule from λjeeves
(Appendix A). Our theorem modifies the traditional notion
of non-interference to accommodate these dependencies. In
the statement of Theorem 2, the resulting environments need
only be L-equivalent when the viewer does not have access
to the high-confidentiality view. If a viewer does not have
access, then the sensitive value should be indistinguishable
from any other value the viewer does not have access to.


5. Implementation 

While previous implementations of Jeeves [7, 48] use Scala,
we implement Jacqueline in Python, as an extension of
Django [1], because of the popularity of both for web programming.
Our code is available at https://github.com/jeanqasaur/jeeves.

5.1 Python Embedding of the Jeeves Runtime 

We implemented Jeeves as a library that dynamically rewrites
code to behave according to the λjeeves semantics. The library
exports functions for creating labels, creating sensitive values,
attaching policies, and using policies to show values. Our
implementation supports a subset of Python's syntax that
includes if-statements, for-loops, and return statements.

5.1.1 Faceted Execution 

To support faceted execution, the implementation defines
a Facet data type for primitives and objects where the
facets may themselves be faceted. A value may exist only
in some execution paths, in which case we use a special
object Unassigned() for other paths. To perform faceted
execution, the implementation uses operator overloading
and dynamic source transformation via the macro library
MacroPy [4]. The source transformation intercepts evaluation
of conditionals, loops, assignments, and function calls.
The implementation handles local assignment by replacing a
function's local scope with a Namespace object determining
scope. To prevent implicit flows, the runtime keeps track of
path conditions to index state updates, database writes, and
policy declarations.
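To give a flavor of how operator overloading can lift ordinary Python operations over facets, here is a small stand-in. It is not the Jeeves library's `Facet` class or its `Unassigned` object: the class bodies, the `lift2` helper, and the choice to propagate `Unassigned` through arithmetic are assumptions made for illustration, and the sketch does not simplify nested facets on the same label.

```python
import operator


class Unassigned:
    """Placeholder for a value that does not exist on some execution path."""

    def __repr__(self):
        return "Unassigned()"


class Facet:
    def __init__(self, label, high, low):
        self.label, self.high, self.low = label, high, low

    def __repr__(self):
        return f"<{self.label} ? {self.high!r} : {self.low!r}>"

    # Overloaded operators delegate to lift2, so client code can write x + 1.
    def __add__(self, other):
        return lift2(operator.add, self, other)

    def __radd__(self, other):
        return lift2(operator.add, other, self)

    def __mul__(self, other):
        return lift2(operator.mul, self, other)

    def __rmul__(self, other):
        return lift2(operator.mul, other, self)


def lift2(op, left, right):
    """Apply a binary operation to every combination of facets."""
    if isinstance(left, Facet):
        return Facet(left.label,
                     lift2(op, left.high, right),
                     lift2(op, left.low, right))
    if isinstance(right, Facet):
        return Facet(right.label,
                     lift2(op, left, right.high),
                     lift2(op, left, right.low))
    if isinstance(left, Unassigned) or isinstance(right, Unassigned):
        return Unassigned()  # nothing to compute on this path
    return op(left, right)


salary = Facet("k", 100, 0)   # secret view 100, public view 0
raised = salary * 2 + 5       # facets propagate through arithmetic
```

Here `raised` is a facet whose secret side is 205 and whose public side is 5, without the client code mentioning the label.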

5.1.2 Evaluating Policies at Computation Sinks 

The runtime maps labels to policies. If there are no mutual
dependencies between policies and sensitive values, the runtime
evaluates policies to determine label values. Otherwise, the
runtime produces an ordering over Boolean label assignments
and uses the SAT subset of the Z3 SMT solver [32] to find a
satisfying assignment.
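The search for a satisfying label assignment can be illustrated with a brute-force stand-in for the solver. The sketch below makes two assumptions that are not taken from the text: a label may be set to True only if its policy holds under the candidate assignment, and assignments that show more (more labels set to True) are tried first. The function name and the policy encoding are hypothetical, and the real runtime delegates this search to Z3.

```python
from itertools import product


def resolve_labels(policies):
    """Return the first assignment, most permissive first, that satisfies
    `label => policy(assignment)` for every label.

    `policies` maps each label name to a function from a candidate
    assignment (dict of label -> bool) to bool.
    """
    labels = sorted(policies)
    candidates = sorted(product([True, False], repeat=len(labels)),
                        key=lambda bits: -sum(bits))
    for bits in candidates:
        assignment = dict(zip(labels, bits))
        if all(policy(assignment) or not assignment[k]
               for k, policy in policies.items()):
            return assignment
    # Unreachable in practice: the all-False assignment always satisfies.
    return {k: False for k in labels}


# Mutual dependency: b may be shown only when a is hidden.
chosen = resolve_labels({
    "a": lambda asg: True,
    "b": lambda asg: not asg["a"],
})
```

With these two policies the search settles on showing `a` and hiding `b`; exhaustive enumeration is exponential in the number of labels, which is one reason to hand the problem to a SAT solver.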

5.2 Jacqueline Implementation 

We extend Django's functionality by "monkey-patching,"
inheriting from Django's classes and overloading the methods
of the FORM. The FORM is responsible for 1) marshalling
between faceted representations in the application
and database and 2) managing the meta-data to track facets
in the database. To represent faceted values, the FORM
creates schemas with additional meta-data columns. The FORM
reconstructs facets from the meta-data by looking up policies
from object schemas and adding them to the runtime
environment. We implement the Early Pruning optimization
by reconstructing only the relevant facets when the runtime
knows the viewer. FORM queries manipulate the meta-data
columns in addition to the actual columns. Programmers may
access the database only through the supported API.
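One way to picture faceted rows with meta-data columns is the following `sqlite3` sketch. The schema is an assumption made for illustration: the column names `jid` (shared identifier of a faceted value) and `jvars` (the branch under which a row is visible), the `"k=True"` string encoding, and the `fetch_facets` helper are hypothetical and are not taken from the Jacqueline source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE paper ("
    " id INTEGER PRIMARY KEY,"
    " jid TEXT,"      # assumed: identifier shared by all facets of one value
    " jvars TEXT,"    # assumed: branch under which this row is visible
    " title TEXT)"
)
# One faceted title is stored as two ordinary rows that share a jid.
conn.executemany(
    "INSERT INTO paper (jid, jvars, title) VALUES (?, ?, ?)",
    [("p1", "k=True", "Faceted Execution"),
     ("p1", "k=False", "[anonymous]")],
)


def fetch_facets(conn, jid, known_branch=None):
    """Load the facets of one value as {branch: title}.

    When the viewer's branch is already known, the meta-data column is
    filtered in the query itself, in the spirit of Early Pruning.
    """
    sql = "SELECT jvars, title FROM paper WHERE jid = ?"
    params = [jid]
    if known_branch is not None:
        sql += " AND jvars = ?"
        params.append(known_branch)
    return dict(conn.execute(sql, params).fetchall())
```

Calling `fetch_facets(conn, "p1")` returns both facets for later resolution, while `fetch_facets(conn, "p1", "k=True")` retrieves only the row relevant to that viewer.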


6. Jacqueline in Practice 

We built 1) a conference management system, 2) a health 
record manager, and 3) a course management system to 
evaluate Jacqueline along the following dimensions: 

• Code architecture. We compare the implementation of
the Jacqueline conference management system to an
implementation with hand-coded policies in Django. We
demonstrate that Jacqueline helps both to centralize
policies and to reduce the size of policy code.

• Performance. We show that for representative actions,
Jacqueline has comparable (and, in one case, better)
performance compared to Django. For the stress tests,
the Jacqueline programs often have close to zero overhead
and at most a 1.75x slowdown compared to vanilla Django.
We also demonstrate the effectiveness and necessity of
the Early Pruning optimization.

While the conference management system has the most 
features, the applications have code and policies of similar 
complexity. We deployed our conference management system 
to run an academic workshop [44]. 

We worked with two undergraduate research assistants to
implement the health record and course manager case studies
to evaluate the usability of our programming model. Both
students found the policy-agnostic approach to be "promising"
and to hide complexity in implementing information flow
policies. They raised some objections about the boilerplate
needed to use the Jeeves and Jacqueline libraries, while
acknowledging that building these features directly into a
language runtime would mitigate this issue.


[Chart: Lines of Policy Code: Jacqueline vs. Django]

Figure 6. Distribution of policy code with Jacqueline and
Django conference management systems.


6.1 Case Study Applications 

Conference management system. We support user registration,
update of profile information, designation of roles (e.g.,
PC member), paper and review submission, and assignment
of reviews. Permissions depend on the current stage of the
conference: submission, review, or decision.

Health record manager. We implemented a simple health
record system based on a representative fragment of the privacy
standards described in the Health Insurance Portability
and Accountability Act (HIPAA) [8, 35]. HIPAA describes
how individuals, hospitals, and insurance companies may
view a medical history depending on roles and stateful
information such as whether there exists a permission waiver.
The case study manages health records and permissions when
viewed by patients, doctors, and insurance companies.

Course manager. Our tool allows instructors and students to
organize assignments and submissions. Policies depend on
the role of the viewer, as well as stateful information such as
whether an assignment has been submitted.

6.2 Code Comparisons 

We compare our Jacqueline implementation of a conference 
management system against a Django implementation of the 
same system. We demonstrate that 1) Jacqueline reduces the 
trusted computing base and 2) separating policies and other 
functionality decreases policy code size. 

6.2.1 Django Conference Management System 

We compare the lines of code in the Jacqueline and Django
conference management systems in Figure 6. (Note that
Jacqueline counts are inflated by the additional imports
and function decorators required.) Jacqueline demonstrates
advantages in both the distribution and size of policy code.
In the Jacqueline implementation, policy code is confined to
the models.py file describing the data schemas, while in the
Django implementation, there are also policies throughout
the controller file views.py. The Jacqueline implementation
has 106 total lines of policy code, whereas the Django
implementation has 130 lines manifesting as repeated checks and
filters across views.py. While the Django code requires auditing
the 575 lines of models.py and views.py, the Jacqueline




    class Paper(Model):
        ...
        @staticmethod
        @label_for('author')
        @jeeves
        def jeeves_restrict_author(paper, ctxt):
            if phase == 'final':
                return True
            else:
                if paper == None:
                    return False
                if PaperPCConflict.objects.get(
                        paper=paper, pc=ctxt) != None:
                    return False
                return ((paper != None and
                         paper.author == ctxt)
                        or (ctxt != None and
                            (ctxt.level == 'chair' or
                             ctxt.level == 'pc')))

Figure 7. Jacqueline schema and code fragments.


Django Schema

    class Paper(Model):
        ...
        def policy_author(self, ctxt):
            if phase == 'final':
                return True
            else:
                try:
                    conflict = PaperPCConflict.objects.get(
                        paper=self, pc=ctxt)
                    return False
                except:
                    return ((self.author == ctxt)
                            or (ctxt != None and
                                (ctxt.level == 'chair' or
                                 ctxt.level == 'pc')))

Python Code with Policy Checks

    def papers_view(request):
        papers = Paper.objects.all()
        for paper in papers:
            if not paper.policy_paperlabel(user):
                paper.author = None

Figure 8. Django schema and code fragments.


code requires auditing only the 200 lines of models.py,
reducing the size of the application-specific
trusted code base by 65%.

We show a fragment of Jacqueline policy code in Figure 7
and a fragment of the analogous Django policy code
in Figure 8, along with an example of how the policy check
functions are called in the Django implementation. The policy
functions are similar across the two implementations, with
small differences from the fact that the Jacqueline API to the
database does not raise an exception if an entry is missing.
The main difference is that in the Django implementation,
the programmer is responsible for calling these policy functions
(as we show in Figure 8), whereas in the Jacqueline
implementation the runtime is responsible for handling the
interaction with policy functions.

Figure 9. Stress test times for our three case studies. We
compare the conference management system times (a) to an
implementation in Django.

6.3 Performance Measurements 

We measured times using an Amazon EC2 m3.2xlarge instance
running Ubuntu 14.04 with 30GB of memory, two
80GB SSD drives, and eight virtual 64-bit Intel(R) Xeon(R)
CPU E5-2670 v2 2.50GHz processors. We use the FunkLoad
testing framework [2] for HTTP requests across the network,
excluding CSS and images. We average over 10 rapid sequential
requests. We test with sequential users because how
well Jacqueline handles concurrent users compared to Django
simply depends on the amount of available memory.

We show that 1) policy enforcement in Jacqueline has reasonable
overheads, especially compared to Django, and 2) Early
Pruning is effective and often necessary.

6.3.1 Stress Tests 

In Figure 9 we show running times from our stress tests.
For each application, we show an increasing number of
Papers   Jacq.     Django     Users   Jacq.    Django
...      ...       ...        ...     ...      ...
256      2.810s    1.633s     256     0.769s   0.820s
512      5.717s    3.265s     512     1.352s   1.269s
1024     10.729s   6.055s     1024    2.305s   1.538s

Table 3. Times to view a list of summary information for
conference manager stress tests, in Jacqueline and Django.



Table 5. Showing all courses, with and without Early Pruning.


         Time to view single paper    Time to view single user
Papers   Jacq.     Django     Users   Jacq.    Django
8        0.160s    0.177s     8       0.164s   0.158s
16       0.165s    0.175s     16      0.164s   0.159s
32       0.160s    0.177s     32      0.164s   0.159s
64       0.159s    0.173s     64      0.164s   0.159s
128      0.160s    0.173s     128     0.167s   0.158s
256      0.159s    0.173s     256     0.163s   0.159s
512      0.159s    0.178s     512     0.169s   0.162s
1024     0.161s    0.173s     1024    0.163s   0.159s

Table 4. Times to view profiles for a single paper and single
user, in Jacqueline and Django.


a given type of data item. The graphs demonstrate that
with both Jacqueline and Django, the time to load data
scales linearly with respect to the underlying algorithms. The
numbers (Table 3) show that Jacqueline has at most a
1.75x overhead. The overhead comes from fetching both
versions of data before resolving the policies. There is no
solver overhead, as there are no mutual dependencies between
sensitive values and policies. Note that these are truly stress
tests: most systems will not load a thousand data rows at
once, especially when each value has its own policy involving
database queries.

6.3.2 Representative Actions 

We increased the number of relevant database entries and measured
the time it takes to view the profiles for single papers
and users. We show these numbers, as well as comparisons
to Django, in Table 4. The time it takes to load these profiles
is under 0.2s and roughly equivalent to the time it takes
to do the equivalent action in Django. For viewing a single
paper, Jacqueline actually performs better than the Django
implementation. This is because in the Django code, the
implementation needs to iterate over collections of data rows again
in order to apply policy checks. In the Jacqueline implementation,
the framework applies the policies and resolves each
one once. Times for submitting a single paper scale similarly.


6.3.3 Early Pruning Optimization 

We found the Early Pruning optimization to be necessary for
nontrivial computations over sensitive values. In the course
manager stress test, the page that shows all courses also looks
up the instructors for each course, leading to blowup. We
show in Table 5 how for just eight courses and instructors, the
system begins to hit memory limits. Because Early Pruning
can simplify other computations after the viewer is known,
these computations are only problematic when they are used
to compute the viewer. We do not expect such computations
to be common.

7. Limitations and Future Work 

The policy-agnostic approach does not protect against a
malicious programmer who implements incorrect policies.
By centralizing the policies and making the implementations
more concise, however, we hope to make it easier to audit
programs to determine that policies are implemented correctly.

The strategy of embedding the programming model in a
Python web framework requires the programmer to enforce
certain invariants. Because sensitive values exist unprotected
in the program runtime and database, the programmer must
access sensitive values only through the designated APIs.
Implementing Jeeves in a language with private class attributes
would alleviate some of these concerns.

Our strategy of using two database rows to encode each 
faceted value means that we cannot use existing database 
support for performing aggregates or optimizing based on 
primary keys. A solution that would also reduce database 
size is to compute low-confidentiality values upon retrieving 
data from the database, rather than storing the values in the 
database. In cases when this is not possible, an alternate 
strategy is to implement user-defined functions in the database 
to optimize based on the Jeeves-based unique ID for each 
faceted value, as well as for aggregates. 

Another future direction involves optimizing queries to 
reduce the amount of data fetched based on the policies 
associated with the data. 

It would also be useful to extend policy-agnostic programming
with faceted values to operating systems. We can
build on the techniques from the Laminar system [39], which
demonstrates how to dynamically enforce policies mediating
access to resources such as files and sockets.

8. Related Work 

Our approach builds on a long history of work in information
flow control [6, 11, 13, 17, 20, 27, 28, 33, 36, 39, 45, 49].
The policy-agnostic approach differs from prior work in the
following key way. Using prior approaches, the programmer
needs to implement the policy checks and filters correctly
across the program. Our solution mitigates programmer
burden by leveraging the language runtime to produce
outputs adhering to policies. This is similar in philosophy to
angelic nondeterminism [10], program repair [40, 41], and
acceptability-oriented computing [37, 38].

Prior work on information flow across the application-database
boundary focuses on rejecting queries that leak
information, rather than on modifying queries to enforce
policies. SeLINQ [42], the work of Lourenço and Caires [29],
and Ur/Web use static types. DBTaint [18], Passe [9], and
Hails [24] perform dynamic analysis. SIF [16] combines
static labels and dynamic checks. There are also approaches
based on symbolic execution [26], secure multi-execution [12,
19, 22], and analysis of data provenance [5, 14] focused on
rejecting programs that violate desired properties.

Policy-agnostic programming differs from other approaches
in how data may affect control flow. Variational
data structures [46] encapsulate properties related to program
customization, but data does not affect control flow.
Aspect-oriented programming [25, 43] has similar goals to
policy-agnostic programming of separating program concerns,
but aspects must be implemented at specific control
flow points and cannot alter control flow.

Our approach addresses information flow as opposed to access
control [23, 31, 34], which prevents leaks at application
endpoints and does not address indirect or implicit flows.
Similarly, work on multi-level databases [21, 30] focuses on the
storage and access control issues surrounding data at different
levels of access in the database.

9. Conclusions 

We demonstrate that it is practically feasible to achieve policy
compliance by construction in database-backed applications.
We present a technique for precise, dynamic information
flow control that tracks sensitive values and policies
through database queries and updates as well as application
code. The technique supports a policy-agnostic programming
model that allows the programmer to specify each information
flow policy once, instead of as repeated intertwined checks
across the program. The web framework performs different
computations depending on the viewer, according to the policies.
The shift of responsibility to the framework reduces the
opportunity for programmer error to cause information leaks.

Our solution works with existing implementations of relational
databases and yields formal guarantees across the
application and database. We implement these ideas in the
Jacqueline web framework and demonstrate that, compared
to traditional applications with hand-coded policies, applications
written using Jacqueline have less policy code and run
with often negligible overheads. This work makes a promising
step towards securing database-backed web applications.
A. Rules from λjeeves

We show the most relevant rules from the dynamic semantics
for the Jeeves core language [7].

A.1 Managing Labels

These rules describe how to declare labels and attach policies
to labels. The rule [F-LABEL] dynamically allocates a label
(label $k$ in $e$), adding a fresh label to the store with the
default policy of $\lambda x.\mathit{true}$. Any occurrences of $k$ in $e$ are
α-renamed to $k'$ and the expression is evaluated with the
updated store. Policies may be further refined (restrict($k$, $e$))
by the rule [F-RESTRICT], which evaluates $e$ to a policy $V$ that
should be either a lambda or a faceted value comprised of
lambdas. The additional policy check is restricted by $pc$, so
that policy checks cannot themselves leak data. The rule joins
the resulting policy check $V_p$ with the existing policy for $k$,
ensuring that policies can only become more restrictive.

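\[
\dfrac{k' \text{ fresh} \qquad \Sigma[k' := \lambda x.\mathit{true}],\ e[k := k'] \Downarrow_{pc} \Sigma', V}{\Sigma, \text{label } k \text{ in } e \Downarrow_{pc} \Sigma', V}\ \text{[F-LABEL]}
\]

\[
\dfrac{\Sigma, e \Downarrow_{pc} \Sigma_1, V \qquad V_p = \langle\langle pc \cup \{k\} \,?\, V : \lambda x.\mathit{true} \rangle\rangle \qquad \Sigma' = \Sigma_1[k := \Sigma_1(k) \wedge_f V_p]}{\Sigma, \text{restrict}(k, e) \Downarrow_{pc} \Sigma', V}\ \text{[F-RESTRICT]}
\]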


A.2 Displaying Outputs 

The rule [F-PRINT] handles print statements (print $\{e_1\}\ e_2$),
where the result of evaluating $e_2$ is printed to the channel
resulting from the evaluation of $e_1$. Both the channel $V_f$ and
the value to print $V_c$ may be faceted values. The rule describes
how to select the facets that correspond with our specified
policies. The rule determines the set of relevant labels through
the transitive closure function closeK. The labels are used to
construct $e_p$ from the relevant policies in the store $\Sigma_2$. The
rule evaluates $e_p$ and applies it to $V_f$, returning the policy
check $V_p$ that is a faceted value containing booleans. The
rule chooses a program counter $pc$ such that the policies
are satisfied. This corresponds to a label assignment that
determines the channel $f$ and the value to print $R$.

\[
\dfrac{
\begin{array}{c}
\Sigma, e_1 \Downarrow_{\emptyset} \Sigma_1, V_f \qquad \Sigma_1, e_2 \Downarrow_{\emptyset} \Sigma_2, V_c \\
\{ k_1 \dots k_n \} = \mathit{closeK}(\mathit{labels}(e_1) \cup \mathit{labels}(e_2), \Sigma_2) \\
e_p = \lambda x.\mathit{true} \wedge_f \Sigma_2(k_1) \wedge_f \dots \wedge_f \Sigma_2(k_n) \\
\Sigma_2, e_p\ V_f \Downarrow_{\emptyset} \Sigma_3, V_p \\
\text{pick } pc \text{ such that } pc(V_f) = f,\ pc(V_c) = R,\ pc(V_p) = \mathit{true}
\end{array}
}{\Sigma, \text{print } \{e_1\}\ e_2 \Downarrow \Sigma_3, f : R}\ \text{[F-PRINT]}
\]

\[
\begin{aligned}
\mathit{closeK}(K, \Sigma) =\ & \text{let } K' = K \cup \bigcup_{k \in K} \mathit{labels}(\Sigma(k)) \text{ in} \\
& \text{if } K' = K \text{ then } K \text{ else } \mathit{closeK}(K', \Sigma)
\end{aligned}
\]
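The closeK function is an ordinary fixed-point computation, which can be sketched in a few lines of Python. In this sketch the store is abstracted away: `policy_labels` is a hypothetical dictionary that maps each label to the set of labels mentioned by its policy, standing in for labels(Σ(k)).

```python
def close_k(labels, policy_labels):
    """Transitive closure of `labels` under "labels mentioned by my policy"."""
    current = set(labels)
    expanded = set(current)
    for k in current:
        expanded |= policy_labels.get(k, set())
    # Stop once a pass adds nothing new; otherwise close over the larger set.
    if expanded == current:
        return current
    return close_k(expanded, policy_labels)


# a's policy mentions b, b's policy mentions c; d is unrelated to a.
dependencies = {"a": {"b"}, "b": {"c"}, "c": set(), "d": {"a"}}
relevant = close_k({"a"}, dependencies)
```

Here `relevant` is `{"a", "b", "c"}`: the label `d` depends on `a` but is not needed to decide how to print a value guarded by `a`.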


B. Proof of Lemma 1 

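Lemma 1 (A).

\[
L(\langle\langle k \,?\, V_1 : V_2 \rangle\rangle) =
\begin{cases}
L(V_1) & \text{if } k \in L \\
L(V_2) & \text{if } k \notin L
\end{cases}
\]

Proof. By case analysis on the definition of $\langle\langle k \,?\, V_1 : V_2 \rangle\rangle$.

Let $x = L(\langle\langle k \,?\, V_1 : V_2 \rangle\rangle)$.

• If $x = L(\langle k \,?\, F_1 : F_2 \rangle)$ for some non-table values $F_1$ and
$F_2$, then this case holds since

  ■ $x = L(F_1)$ if $k \in L$.

  ■ $x = L(F_2)$ if $k \notin L$.

• If $x = L(\langle\langle k \,?\, \text{table } T_1 : \text{table } T_2 \rangle\rangle)$, then $x = L(\text{table } T)$
where

\[
T = \{ (B \cup \{k\}, s) \mid (B, s) \in T_1,\ \neg k \notin B \}
\cup \{ (B \cup \{\neg k\}, s) \mid (B, s) \in T_2,\ k \notin B \}.
\]

And so

\[
x = \{ (\emptyset, s) \mid (B, s) \in T_1,\ \neg k \notin B,\ B \cup \{k\} \sim L \}
\cup \{ (\emptyset, s) \mid (B, s) \in T_2,\ k \notin B,\ B \cup \{\neg k\} \sim L \}.
\]

  ■ If $k \in L$, then $B \cup \{\neg k\} \not\sim L$ and
  $B \cup \{k\} \sim L \Rightarrow \neg k \notin B$, and so
  $x = \{ (\emptyset, s) \mid (B, s) \in T_1,\ B \sim L \} = L(\text{table } T_1)$, as required.

  ■ If $k \notin L$, then this case holds by a similar argument as
  the previous case.

□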


• For case [F-ASSIGN], $\forall a'$ where $a' \neq a$, $\Sigma(a') = \Sigma'(a')$.
Since $pc \not\sim L$, $L(\Sigma(a)) = L(\Sigma'(a))$ by Lemma 1, as
required.

□


C. Proof of Lemma 2 


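Lemma 2 (B).

\[
L(\langle\langle B \,?\, V_1 : V_2 \rangle\rangle) =
\begin{cases}
L(V_1) & \text{if } B \sim L \\
L(V_2) & \text{if } \neg(B \sim L)
\end{cases}
\]

Proof. The proof is by induction and case analysis on the
derivation of $L(\langle\langle B \,?\, V_1 : V_2 \rangle\rangle)$. Let $x = L(\langle\langle B \,?\, V_1 : V_2 \rangle\rangle)$.

• If $B = \emptyset$ then $B \sim L$, so $x = L(V_1)$ as required.

• Otherwise, $B = B' \cup \{k\}$.

  ■ If $B \sim L$, then

\[
\begin{aligned}
x &= L(\langle\langle k \,?\, \langle\langle B' \,?\, V_1 : V_2 \rangle\rangle : V_2 \rangle\rangle) \\
  &= L(\langle\langle B' \,?\, V_1 : V_2 \rangle\rangle) && \text{by Lemma 1, since } k \in L \\
  &= L(V_1) && \text{by induction, as } B' \sim L.
\end{aligned}
\]

  ■ Otherwise, $B \not\sim L$, then

    - if $k \notin L$, then $x = L(V_2)$ by Lemma 1.

    - otherwise $k \in L$, so $B' \not\sim L$.
      Therefore, $x = L(\langle\langle B' \,?\, V_1 : V_2 \rangle\rangle) = L(V_2)$, as
      required.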


This lemma is also useful in the proof of the Projection 
Theorem. 

E. Proof of Theorem 1 (Projection) 

For convenience, we restate Theorem 1. 

Suppose $\Sigma, e \Downarrow_{pc} \Sigma', V$. Then for any view $L$ for which $pc$ is
visible,

\[
L(\Sigma), L(e) \Downarrow_{\emptyset} L(\Sigma'), L(V)
\]

The proof extends $L$ to project evaluation contexts. They
may project away the hole and so map evaluation contexts to
expressions, in which case filling the result is a no-op.

We capture in the following lemma the property that if a
branch $B$ is inconsistent with the program counter $pc$, at most
one of $B$ and $pc$ may be visible to any given view $L$.

Lemma 4. If $B$ is inconsistent with $pc$ and $pc \sim L$, then $B \not\sim L$.

With these properties established, we now prove projection.

Proof. By induction on the derivation of $L(\Sigma), L(e) \Downarrow_{\emptyset}
L(\Sigma'), L(V)$ and by case analysis on the final rule used in
that derivation.


□ 

D. Lemma 3 

If a set of branches is compatible with view $L$, then we can
execute only using that view. We prove an additional lemma
that if $pc$ is not visible, then execution should not affect the
environment under projections of $L$.

Lemma 3 (C). If $pc$ is not visible to $L$ and
$\Sigma, e \Downarrow_{pc} \Sigma', V$
then $L(\Sigma) = L(\Sigma')$.

Proof. By induction on the derivation of $\Sigma, e \Downarrow_{pc} \Sigma', V$ and by
case analysis on the final rule used in that derivation.

• The following cases hold because $\Sigma = \Sigma'$: [F-VAL],
[F-DEREF-NULL], [F-DEREF], [F-CTXT], [F-APP], [F-LEFT],
[F-RIGHT], [F-ROW], [F-SELECT], [F-PROJECT], [F-JOIN],
[F-UNION], [F-FOLD-EMPTY], and [F-FOLD-INCONSISTENT].

• Cases [F-APP], [F-LEFT], [F-RIGHT], [F-STRICT], [F-CTXT],
and [F-FOLD-INCONSISTENT] hold by induction.

• For case [F-SPLIT], we note that since $pc \not\sim L$, $\forall k.\ pc \cup
\{k\} \not\sim L$ and $pc \cup \{\neg k\} \not\sim L$. Therefore, this case also holds
by induction.

• Similarly, for case [F-FOLD-CONSISTENT], since $pc \not\sim L$,
$\forall B.\ pc \cup B \not\sim L$, and so this case holds by induction.

• For case [F-REF], $\forall a'$ where $a' \neq a$, $\Sigma(a') = \Sigma'(a')$.
Since $pc \not\sim L$, $L(\Sigma'(a)) = 0$ by Lemma 1, as required.


• The following cases hold trivially: [f-val], [f-deref], 

[F-DEREF-NULL], [F-ROW], [F-PROJECT], and [F-UNION]. 

• For case [f-select], e = (table T), so 

'L,Oi=j (table T) (1^^ E, (table T') 
where T' = {(B,s) | s,- = Sj}. 

This case holds since L(table T) = {(0,S) | {B,s) GT,B ^ 
L}andL(table 7') = {(0,5) | {B,i) GT,B ^ L,Si = sj}, 

• For case [e-join], e = (table T\) txi (table T2), so 

E, (table Ti) cxi (table T2) jj-pc E, (table T) 
where T = {B.B',L?) | (B,s) e Tj, (B^?) G T2}. 

L{T) = {{B.B',s.s') I (B, 5 )G 7 ’i,(B',y)Gr 2 ,B.B'~L}, 
so this case holds. 

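• For case [F-CTXT], $e = E[e']$. By the antecedents of this
rule

$E \neq [\,]$
$e'$ not a value
$\Sigma, e' \Downarrow_{pc} \Sigma_1, V'$
$\Sigma_1, E[V'] \Downarrow_{pc} \Sigma', V$

Note that $L(E[V']) = L(E)[L(V')]$, etc., so by induction

$L(\Sigma), L(e') \Downarrow_{\emptyset} L(\Sigma_1), L(V')$
$L(\Sigma_1), L(E)[L(V')] \Downarrow_{\emptyset} L(\Sigma'), L(V)$

Therefore, $L(\Sigma), L(E[e']) \Downarrow_{\emptyset} L(\Sigma'), L(V)$, as required.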

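• For case [F-STRICT], $e = S[\langle k \mathbin{?} V_1 : V_2 \rangle]$. By the
antecedents of this rule

$\Sigma, \langle k \mathbin{?} S[V_1] : S[V_2] \rangle \Downarrow_{pc} \Sigma', V$

We now consider each possible case for the next step in
the derivation.

■ For subcase [F-LEFT], we know that $k \in pc$, $k \in L$ and

$\Sigma, S[V_1] \Downarrow_{pc} \Sigma', V$

By induction, $L(\Sigma), L(\langle k \mathbin{?} S[V_1] : S[V_2] \rangle) \Downarrow_{\emptyset} L(\Sigma'), L(V)$.

■ Subcase [F-RIGHT] holds by a similar argument.

■ For subcase [F-SPLIT], $k \notin pc$, $\neg k \notin pc$ and

$\Sigma, S[V_1] \Downarrow_{pc \cup \{k\}} \Sigma'', V''$
$\Sigma'', S[V_2] \Downarrow_{pc \cup \{\neg k\}} \Sigma', V'''$
$V = \langle\!\langle k \mathbin{?} V'' : V''' \rangle\!\rangle$

- If $k \in L$, then by induction we have
$L(\Sigma), L(S[V_1]) \Downarrow_{\emptyset} L(\Sigma''), L(V'')$.
$L(\Sigma'') = L(\Sigma')$ by Lemma 3, and $L(V) = L(V'')$.
Therefore, $L(\Sigma), L(S[V_1]) \Downarrow_{\emptyset} L(\Sigma'), L(V)$, as required.

- If $k \notin L$, then this case holds by a similar argument.

• For case [F-FOLD-EMPTY], we have

$\Sigma, \mathsf{fold}\ V_f\ V_b\ (\mathsf{table}\ \epsilon) \Downarrow_{pc} \Sigma, V_b$

Clearly, $L(\Sigma), \mathsf{fold}\ L(V_f)\ L(V_b)\ L(\mathsf{table}\ \epsilon) \Downarrow_{\emptyset} L(\Sigma), L(V_b)$.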

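• For case [F-FOLD-INCONSISTENT], we have
$e = \mathsf{fold}\ V_f\ V_b\ (\mathsf{table}\ (B,s).T)$.
By the antecedents of this rule, we have

$\Sigma, \mathsf{fold}\ V_f\ V_b\ (\mathsf{table}\ T) \Downarrow_{pc} \Sigma', V$
$B$ is inconsistent with $pc$

By Lemma 4, $B \not\sim L$.

Therefore, $L(\mathsf{table}\ (B,s).T) = L(\mathsf{table}\ T)$.

By the [F-FOLD-EMPTY] rule,

$L(\Sigma), \mathsf{fold}\ L(V_f)\ L(V_b)\ L(\mathsf{table}\ (B,s).T) \Downarrow_{\emptyset} L(\Sigma'), L(V)$

By induction, $L(\Sigma), L(\mathsf{fold}\ V_f\ V_b\ (\mathsf{table}\ T)) \Downarrow_{\emptyset} L(\Sigma'), L(V)$,
as required.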

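• For case [F-FOLD-CONSISTENT], $e = \mathsf{fold}\ V_f\ V_b\ (\mathsf{table}\ (B,s).T)$.
By the antecedents of this rule, we have

$\Sigma, \mathsf{fold}\ V_f\ V_b\ (\mathsf{table}\ T) \Downarrow_{pc} \Sigma_1, V_1$
$B$ is consistent with $pc$
$\Sigma_1, V_f\ s\ V_1 \Downarrow_{pc \cup B} \Sigma', V_2$
$V = \langle\!\langle B \mathbin{?} V_2 : V_1 \rangle\!\rangle$

■ If $B \sim L$, then $pc \cup B \sim L$.

By induction,

$L(\Sigma), L(\mathsf{fold}\ V_f\ V_b\ (\mathsf{table}\ T)) \Downarrow_{\emptyset} L(\Sigma_1), L(V_1)$
$L(\Sigma_1), L(V_f\ s\ V_1) \Downarrow_{\emptyset} L(\Sigma'), L(V_2)$

By Lemma 2, $L(V) = L(\langle\!\langle B \mathbin{?} V_2 : V_1 \rangle\!\rangle)$, as required.

■ Otherwise, $B \not\sim L$, and therefore $pc \cup B \not\sim L$. By
Lemma 3, $L(\Sigma_1) = L(\Sigma')$.

We have $L(\Sigma), L(\mathsf{fold}\ V_f\ V_b\ (\mathsf{table}\ T)) \Downarrow_{\emptyset} L(\Sigma_1), L(V_1)$
by induction.

$L(\mathsf{table}\ (B,s).T) = L(\mathsf{table}\ T)$.

By Lemma 2, $L(V) = L(\langle\!\langle B \mathbin{?} V_2 : V_1 \rangle\!\rangle)$, as required.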


• For case [F-LEFT], $e = \langle k \mathbin{?} e_1 : e_2 \rangle$.

By the antecedents of this rule, we have

$k \in pc$
$\Sigma, e_1 \Downarrow_{pc} \Sigma', V$

Since $k \in pc$, $L(e) = L(e_1)$.

By induction, $L(\Sigma), L(e_1) \Downarrow_{\emptyset} L(\Sigma'), L(V)$.

• Case [F-RIGHT] holds by a similar argument.

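• For case [F-SPLIT], $e = \langle k \mathbin{?} e_1 : e_2 \rangle$.

By the antecedents of this rule, we have

$k \notin pc$, $\neg k \notin pc$
$\Sigma, e_1 \Downarrow_{pc \cup \{k\}} \Sigma_1, V_1$
$\Sigma_1, e_2 \Downarrow_{pc \cup \{\neg k\}} \Sigma', V_2$
$V = \langle\!\langle k \mathbin{?} V_1 : V_2 \rangle\!\rangle$

■ If $k \in L$, then by induction $L(\Sigma), L(e_1) \Downarrow_{\emptyset} L(\Sigma_1), L(V_1)$.
$L(\Sigma_1) = L(\Sigma')$ by Lemma 3, and by Lemma 1
$L(V) = L(\langle\!\langle k \mathbin{?} V_1 : V_2 \rangle\!\rangle) = L(V_1)$, as required.

■ Otherwise $\neg k \in L$, so $L(\Sigma) = L(\Sigma_1)$ by Lemma 3.
By induction, $L(\Sigma_1), L(e_2) \Downarrow_{\emptyset} L(\Sigma'), L(V_2)$,
and by Lemma 1 $L(V) = L(\langle\!\langle k \mathbin{?} V_1 : V_2 \rangle\!\rangle) = L(V_2)$,
as required.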

• For case [F-APP], $e = (\lambda x.e'\ V')$. By the rule antecedents,

$\Sigma, e'[x := V'] \Downarrow_{pc} \Sigma', V$

We know that $L(e) = L(\lambda x.e'\ V') = L(e'[x := V'])$.

By induction, $L(\Sigma), L(e'[x := V']) \Downarrow_{\emptyset} L(\Sigma'), L(V)$, as required.

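• For case [F-REF], $e = \mathsf{ref}\ V'$. By the rule antecedents

$a \notin \mathit{dom}(\Sigma)$
$\Sigma' = \Sigma[a := \langle\!\langle pc \mathbin{?} V' : 0 \rangle\!\rangle]$

Without loss of generality, we assume that both evaluations
allocate the same address $a$. Since $a \notin \mathit{dom}(\Sigma)$,
$a \notin \mathit{dom}(L(\Sigma))$.

Also, we know that $\forall a' \in \mathit{dom}(\Sigma),\ \Sigma(a') = \Sigma'(a')$, and
therefore $L(\Sigma(a')) = L(\Sigma'(a'))$.

Since $pc \sim L$, $L(\Sigma'(a)) = L(\langle\!\langle pc \mathbin{?} V' : 0 \rangle\!\rangle) = L(V')$ by
Lemma 2. Since $L(\langle\!\langle \emptyset \mathbin{?} V' : 0 \rangle\!\rangle) = L(V') = L(V)$, this
case holds.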

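• For case [F-ASSIGN], $e = (a := V)$. By the antecedent
of this rule, $\Sigma' = \Sigma[a := \langle\!\langle pc \mathbin{?} V : \Sigma(a) \rangle\!\rangle]$. We know
$\forall a' \in \mathit{dom}(\Sigma)$ with $a' \neq a$, $\Sigma(a') = \Sigma'(a')$, and therefore
$L(\Sigma(a')) = L(\Sigma'(a'))$.

Since $pc \sim L$, $L(\Sigma'(a)) = L(\langle\!\langle pc \mathbin{?} V : \Sigma(a) \rangle\!\rangle) = L(V)$ by
Lemma 2. And since $L(\langle\!\langle \emptyset \mathbin{?} V : \Sigma(a) \rangle\!\rangle) = L(V)$, this case
holds.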

□ 
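The relational cases of the projection proof amount to saying that projecting a faceted table and then running a query gives the same rows as running the query and then projecting. The following sketch checks this for selection and join on toy data. It is an illustration under assumptions, not the Jacqueline implementation: tables are lists of `(B, s)` pairs, $B \sim L$ is encoded as in `visible` below, projected rows carry the empty branch set (as in the [F-SELECT] case), and the names `project_table`, `select`, `join`, and `commutes` are hypothetical.

```python
def visible(branches, view):
    """Assumed B ~ L: each k in B is in the view, each "not k" in B is not."""
    return all((label in view) == positive for label, positive in branches)


def project_table(view, table):
    """L(table T): keep rows whose branches are visible, with empty branch sets."""
    return [(frozenset(), row) for branches, row in table if visible(branches, view)]


def select(table, i, j):
    """sigma_{i=j}: keep rows whose i-th and j-th fields are equal."""
    return [(branches, row) for branches, row in table if row[i] == row[j]]


def join(t1, t2):
    """Cross-product join: union the branch sets and concatenate the rows."""
    return [(b1 | b2, r1 + r2) for b1, r1 in t1 for b2, r2 in t2]


K = ("k", True)        # branch k
NOT_K = ("k", False)   # branch "not k"

T1 = [
    (frozenset({K}), ("alice", "alice")),
    (frozenset({NOT_K}), ("anon", "anon")),
    (frozenset(), ("bob", "carol")),
]
T2 = [
    (frozenset({K}), ("secret",)),
    (frozenset(), ("public",)),
]
VIEWS = [frozenset(), frozenset({"k"})]


def commutes(view):
    """Check that projection commutes with select and join for this view."""
    select_ok = (project_table(view, select(T1, 0, 1))
                 == select(project_table(view, T1), 0, 1))
    join_ok = (project_table(view, join(T1, T2))
               == join(project_table(view, T1), project_table(view, T2)))
    return select_ok and join_ok
```

The join check relies on the fact that, under the assumed encoding, a combined branch set is visible exactly when both of its parts are.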



